optimize sequence number calculation and reduce search requests in doc level monitor execution #1445
…where n is number of shards being queried in the execution Signed-off-by: Surya Sashank Nistala <[email protected]>
…es and datastreams during monitor creation Signed-off-by: Surya Sashank Nistala <[email protected]>
@@ -615,7 +620,7 @@ class DocumentLevelMonitorRunner : MonitorRunner() {
        )
    }

    private suspend fun updateLastRunContext(
Are we sure we do not need suspend anymore? I recall we needed it before
We needed `suspend` earlier because the method itself made a search call to calculate the max seq_no.
We have removed that search call, since the max seq_no is now calculated in the first pass, so `suspend` is no longer needed.
val prevSeqNo = indexExecutionCtx.lastRunContext[shard].toString().toLongOrNull()
val from = prevSeqNo ?: SequenceNumbers.NO_OPS_PERFORMED
var to: Long = Long.MAX_VALUE
while (to >= from) {
Why do we need this while loop? Also, it seems like we reduce `to` by 10000 from the currently fetched seqNo?
Take one shard as an example.
In the current execution: previous seq_no = 10000, current max seq_no = 40000.
First iteration of the shard search (which also determines the max sequence number): TO = Long.MAX_VALUE, FROM = prev_seq_no; calculate max_seq_no.
Second iteration: TO = max_seq_no - 10000, FROM = prev_seq_no.
Third iteration: TO = TO - 10000, FROM = prev_seq_no.
In the first iteration we get the first 10k results and also calculate the max sequence number.
In the second iteration we reduce TO to max_seq_no - 10000 and query again. We continue this loop until we have seen all docs back to the previous sequence number.
Made a change to set the TO variable of the next iteration to 1 less than the lowest seq_no among the docs fetched in the current search.
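The paging scheme described in this thread can be sketched roughly as follows. This is an illustration only, not the code in the PR: `searchShard`, `scanShard`, and `ShardScan` are hypothetical names, the shard is modelled as an in-memory list of sequence numbers instead of a real search request, and the range is assumed to be exclusive at FROM and inclusive at TO.

```kotlin
// Hypothetical stand-in for a per-shard search: returns at most `size` sequence
// numbers in the range (from, to], newest first. A real monitor would issue a
// search request sorted by _seq_no descending instead of filtering a list.
fun searchShard(shardSeqNos: List<Long>, from: Long, to: Long, size: Int): List<Long> =
    shardSeqNos.filter { it > from && it <= to }.sortedDescending().take(size)

// Result of scanning one shard: the new high-water mark, every sequence number
// that was visited, and how many searches were needed.
data class ShardScan(val newMaxSeqNo: Long, val seen: List<Long>, val searches: Int)

fun scanShard(shardSeqNos: List<Long>, prevSeqNo: Long, pageSize: Int): ShardScan {
    val from = prevSeqNo
    var to = Long.MAX_VALUE          // first pass is unbounded above
    var newMax = prevSeqNo           // stays unchanged if nothing new was written
    val seen = mutableListOf<Long>()
    var searches = 0
    while (to >= from) {
        val hits = searchShard(shardSeqNos, from, to, pageSize)
        searches++
        if (hits.isEmpty()) break
        // The first page is sorted newest first, so its head is the max seq_no.
        if (to == Long.MAX_VALUE) newMax = hits.first()
        seen += hits
        // Next page ends just below the lowest sequence number fetched so far.
        to = hits.last() - 1
    }
    return ShardScan(newMax, seen, searches)
}
```

Because the next TO is derived from the lowest sequence number actually returned, gaps in the sequence (for example from deletes or updates) do not cause pages to overlap or be skipped in this sketch.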
monitor.id,
conflictingFields,
updateLastRunContext(shard, hits.hits[0].seqNo.toString())
to = hits.hits[0].seqNo - 10000L
Are we only fetching 10,000 docs and is that why we chose 10,000? If so, this is not right. Sequence numbers are incremented for every new write operation on the index and that includes deletion.
The earlier logic was completely wrong.
It calculated the max sequence number, assumed it would always be less than (prev_seq_no + 10000), and hence made only one search call with from = prev_seq_no, to = max_seq_no, and a hard-coded size of 10k.
The current logic is explained in the comment above: #1445 (comment)
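To make the failure mode of the earlier single-search approach concrete, here is a small sketch under the same simplifying assumptions (an in-memory list of sequence numbers standing in for a shard). The function names `singleCappedSearch` and `missedBySingleSearch` are hypothetical and do not appear in the plugin.

```kotlin
// Hypothetical model of the earlier behaviour: one search over (prevSeqNo, max]
// with a hard-coded result size, newest first.
fun singleCappedSearch(shardSeqNos: List<Long>, prevSeqNo: Long, size: Int): List<Long> =
    shardSeqNos.filter { it > prevSeqNo }.sortedDescending().take(size)

// Sequence numbers written since the last run that the single search never returns.
fun missedBySingleSearch(shardSeqNos: List<Long>, prevSeqNo: Long, size: Int): List<Long> {
    val fetched = singleCappedSearch(shardSeqNos, prevSeqNo, size).toSet()
    return shardSeqNos.filter { it > prevSeqNo && it !in fetched }
}
```

With 20 new operations and a size of 10, the ten oldest new sequence numbers are never examined, which is the gap the paging loop is meant to close.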
…c level monitor execution (opensearch-project#1445)
* optimize sequence number calculation and reduce search requests by n where n is number of shards being queried in the execution
* fix tests
* optimize check indices and execute to query only write index of aliases and datastreams during monitor creation
* fix test
* add javadoc
* add tests to verify seq_no calculation
Signed-off-by: Surya Sashank Nistala <[email protected]>
The backport to
To backport manually, run these commands in your terminal:
# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/alerting/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/alerting/backport-2.x
# Create a new branch
git switch --create backport-1445-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 faaf5520a71ce1b213968eadace6e7689320d997
# Push it to GitHub
git push --set-upstream origin backport-1445-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/alerting/backport-2.x
Then, create a pull request where the
Optimize sequence number calculation and reduce search requests by n, where n is the number of shards being queried in the execution.
Issue #, if available:
#1409
Description of changes:
Use the first search request on each shard to compute the max sequence number, instead of making a separate search call just to calculate that value.
Fetch data in descending order of sequence numbers to directly calculate the max seq_no.
New heuristic:
Take one shard as an example.
In the current iteration: previous seq_no = 10000, current max seq_no = 40000.
First iteration of the shard search (which also determines the max sequence number): TO=int_max, FROM=prev_seq_no; calculate max_seq_no.
Second iteration: TO=max_seq_no-10000, FROM=prev_seq_no.
Further iterations: TO=TO-10000, FROM=prev_seq_no, repeated while TO > FROM.
In the first iteration we get the first 10k results and also calculate the max sequence number.
In the second iteration we reduce TO to max_seq_no-10000 and query again.
We continue this loop until we have seen all docs down to the previous sequence number.
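The loop described above can be sketched roughly as follows. This is an illustrative, in-memory model only, not the code in this PR: `FakeShard`, `ShardScan`, and `scanShard` are hypothetical names, the shard is reduced to a list of seq_nos, and the range is assumed to be exclusive of FROM and inclusive of TO.

```kotlin
// Hypothetical in-memory stand-in for one shard: only the seq_nos of its docs.
class FakeShard(private val seqNos: List<Long>) {
    var searchCount = 0
        private set

    // Returns up to `size` seq_nos in the range (from, to], highest first.
    fun search(from: Long, to: Long, size: Int): List<Long> {
        searchCount++
        return seqNos.filter { it > from && it <= to }
            .sortedDescending()
            .take(size)
    }
}

// Result of scanning one shard: the max seq_no and every seq_no that was read.
data class ShardScan(val maxSeqNo: Long, val seen: Set<Long>)

fun scanShard(shard: FakeShard, prevSeqNo: Long, pageSize: Int = 10_000): ShardScan {
    val from = prevSeqNo
    var to = Long.MAX_VALUE          // first pass: no upper bound
    var maxSeqNo = prevSeqNo
    val seen = LinkedHashSet<Long>() // a set, so overlapping pages do not duplicate docs
    var firstPass = true

    while (to > from) {
        val hits = shard.search(from, to, pageSize)
        if (hits.isEmpty()) break    // nothing left in (from, to]
        seen.addAll(hits)
        if (firstPass) {
            // Results are sorted descending, so the first hit carries the max seq_no.
            maxSeqNo = hits.first()
            to = maxSeqNo
            firstPass = false
        }
        to -= pageSize               // slide the upper bound down by one page
    }
    return ShardScan(maxSeqNo, seen)
}

fun main() {
    // The example from the description: prev seq_no = 10000, max seq_no = 40000.
    val shard = FakeShard((0L..40_000L).toList())
    val result = scanShard(shard, prevSeqNo = 10_000L)
    println("max=${result.maxSeqNo} docs=${result.seen.size} searches=${shard.searchCount}")
}
```

With the numbers from the example, this sketch issues three searches (TO=int_max, then TO=30000, then TO=20000) and learns the max seq_no from the first of them, so no separate max seq_no lookup is needed for the shard.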
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.